This article, prepared for the Encyclopedia of Statistical Sciences, gives
many references to geological papers, arranged according to the statistical
methods used. It should be useful to anyone preparing a course for
geologists. It may also aid statisticians looking for applications of
techniques.

1.  INTRODUCTION


Geology  seeks  to  describe  and  understand  the  processes  which  have  acted 
in  the  past,  and  are  acting  now,  to  form  the  continents  and  oceans  with  their 
mountains and valleys, and which have led to the varied sequences of rocks of
differing compositions and structures. At the time of Darwin's 1831-1836
voyage on the Beagle, geology was closely linked with biology as the study
of "natural history," and both then made great leaps forward. In fact,
Lyell's book (1830-1832) on stratigraphy (epochs were defined statistically
by fossil contents) was Darwin's inspiration. It is a curious historical fact
that,  while  the  intense  application  of  the  physical  sciences  led  the  subjects 
to  diverge,  both  had  their  next  revolution  at  about  the  same  time  —  molecular 
biology  in  the  1950's  and  plate  tectonics  in  the  1960's. 

Mathematics  entered  geology  when  the  physics  of  the  earth  was  studied  — 
gravity  and  the  figure  of  the  earth,  tides  in  the  oceans,  air  and  solid  earth, 
the  cooling  of  the  earth,  earthquakes  and  the  propagation  of  waves  around  the 
earth.  Most  study  of  the  earth  must  be  a  matter  of  inference,  because  it  is 
necessarily  indirect  —  only  Jules  Verne  could  imagine  a  "Voyage  to  the  Center 
of  the  Earth."  Also,  geological  field  measurements  are  subject  to  greater 
errors  than  laboratory  work  in  chemistry  and  physics,  and  it  is  often  not 
possible  to  take  "random  samples."  It  is  something  of  a  coincidence  that  a 
little  after  Sir  Ronald  Fisher's  (q.v.)  development  of  statistics  largely  for 
biologists,  an  eminent  geophysicist,  Sir  Harold  Jeffreys,  should  develop  his 
own  theory  of  statistics  (1931-1939).  Jeffreys'  logical  predisposition  led 
him  to  a  mathematical  rule  for  deriving  priors  rather  than  to  use  a  purely 
subjective  origin.  For  his  geological  achievements  and  classical  mathematical 
geophysics, readers should also consult his famous text, The Earth (1924-1961).

The earth sciences may also claim to have initiated several areas of
statistical theory and practice. There are so many periodic or pseudo-periodic
earth  phenomena  that  Sir  George  Stokes'  (1879)  introduction  of  the  Fourier 
transform  of  data  and  its  development  by  Schuster  --  see,  e.g.,  Brillinger 
(1975)  —  was  natural.  The  most  advanced  applications  of  Time  Series  Analysis 
(q.v.)  are  still  to  be  found  in  geophysics  --  a  comprehensive  bibliography  has 
been  given  by  Tukey  (1965).  The  orientation  of  pebbles  (Krumbein,  1939)  and 
the  direction  of  magnetism  of  rocks  (Fisher,  1953)  led  to  the  development  of 
methods  for  Directional  Data  Analysis  (q.v.);  a  survey  with  many  references 
to papers in this area was given by Watson (1970). More recently, economic
geology and efficient mining have led to Geostatistics (q.v.), an extensive
application of random function theory by Matheron (1965). Chemical
petrologists study the proportions of substances making up rocks, so their
data add to unity. The study of the correlation of proportions raises special
problems that have occupied geologists more than other scientists; see, e.g.,
Chayes (1971). The study of rock sections (e.g., Chayes, 1956) has led to
stereological and geometrical probability problems, as has exploration
geology. Geologists have always needed maps and photographs of sections. Now
the computer is being used heavily to produce and process such information;
Matheron (1967) provides a theoretical background; see also Mathematical
Morphology (q.v.).

There  has  been  a  very  rapid  growth  in  the  use  of  computers,  mathematics 
and  statistics  in  geology  in  the  1970's.  This  literature  is  fairly  easy  to 
enter.  The  American  Geological  Institute  publishes  a  Bibliography  and  Index 
in  which  most  of  the  relevant  articles  appear  under  the  main  heading  of 
"Automatic  Data  Processing,"  though  some  appear  under  "Mathematical  Geology." 

Two journals, Mathematical Geology and Computers and Geosciences, specializing
in these topics, began in the 70's. There are a number of general texts (e.g.,
Agterberg, 1974; Davis, 1973) and a number devoted to specific topics to be
mentioned below. D. F. Merriam has edited many symposia. As these
quantitative methods become a recognized part of all subdivisions of geology,
the specialized journals (Sedimentology, to give just one example) all carry
articles of statistical interest.

The  following  sections  are  chosen  to  show  the  methods  and  problems  of 
special  interest  which  are  to  be  found  currently  in  geology.  The  references 
given will lead the reader further. Exploration and resource estimation and
exploitation are ignored here but are partly covered under Geostatistics (q.v.).

2.  DATA  BANKS 

Efforts  are  being  made  to  computerize  data  so  it  can  be  accessed  easily 
—  see,  e.g.,  Chayes  (1979)  for  igneous  petrology.  Much  interesting  data  is 
unavailable  because  of  its  economic  value  to  those  who  possess  it. 

3.  STOCHASTIC  MODELS 

This field is very wide indeed. Earth movements lead to an interest in
the growth of cracks or fractures; see, e.g., Vere-Jones (1977). The
occurrence and strength of earthquakes and volcanic eruptions have been the
subject of much point process modelling; see, e.g., Adamopoulos (1976).
Erosion and sedimentation require a knowledge of particle size distributions
(q.v.). Models for forming sands and powders often lead to the lognormal
(e.g., Kolmogoroff, 1941) and Weibull (see, e.g., Kittleman, 1964)
distributions. Considerations of the transport and deposition of sand (see,
e.g., Bagnold, 1954; Sen Gupta, 1975) lead to other distributions and
stochastic processes. Kolmogoroff (1949) first modelled the deposition and
subsequent erosion of sediments. His model was studied further by Hattori
(1973). This theory is distinct from the literature that tries to fit Markov
chains (q.v.; see below) to the succession of beds according to their
composition rather than their thicknesses, though Hattori deals with both
approaches. The present writer regards the "Markov approach" as more data
analysis than modelling. Grenander (1975) has provided a stationary
stochastic model (q.v.) on the circle (which is easily generalized to the
sphere) for the height of the land surface. Erosion is modelled by diffusion,
which always smooths, and inequalities are maintained by uplifts at random
time points described by independent random functions (random function,
q.v.).
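
The route from breakage models to the lognormal distribution can be illustrated
with a toy simulation. The sketch below is not the model of any paper cited
here; it assumes, purely for illustration, that each particle undergoes a fixed
number of independent breaks and that each break retains a uniformly
distributed fraction of the current size. Under these assumptions the logarithm
of the final size is a sum of independent terms, so it should be roughly normal,
which is to say the size itself is roughly lognormal. The function names and all
parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_log_sizes(n_particles=2000, n_breaks=50, seed=0):
    """Log-sizes after repeated proportional breakage (hypothetical toy model)."""
    rng = random.Random(seed)
    log_sizes = []
    for _ in range(n_particles):
        log_size = 0.0  # every particle starts at unit size, log(1) = 0
        for _ in range(n_breaks):
            # Assumption: each break keeps a uniform fraction in (0.2, 1.0).
            log_size += math.log(rng.uniform(0.2, 1.0))
        log_sizes.append(log_size)
    return log_sizes


def skewness(values):
    """Moment coefficient of skewness; close to zero for a normal sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


log_sizes = simulate_log_sizes()
print("mean log-size:", round(statistics.fmean(log_sizes), 2))
print("skewness of log-size:", round(skewness(log_sizes), 2))
```

A skewness of the log-sizes near zero is consistent with, though of course not
proof of, approximate lognormality of the sizes themselves.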

The  study  of  streams  in  drainage  basins,  their  lengths  and  topology  is 
fascinating  --  see,  e.g.,  Dacey  and  Krumbein  (1976). 

The distribution of elements has a large literature, but statistical
models to explain it are few. Kawabe (1977) gives a model and a literature
list, including references to papers by Ahrens (1963), who felt that lognormal
(q.v.) distributions of elements were a law of nature.

4.  DATA  ANALYSIS 

Nowadays, all the common statistical procedures are used widely. Most
data is observational. One collects rocks where they happen to be exposed
and accessible, so the problems of "non-random samples" are very serious.
The earth is a sample of one. The list below gives leads to areas of
particular interest.

Clustering  Methods  (q.v.) 

Dendrograms (q.v.) and other methods are often used to relate fossils,
rocks, etc., to help explain their evolution. Petrofabric and other studies
yield orientations plotted on a sphere. Deciding whether the points fall in
groups or clusters is a common problem, but one that may be attacked in
different ways.

Factor  Analysis

Factor analysis (q.v.) is widely used in palaeontology and elsewhere;
see, e.g., Jöreskog, Klovan and Reyment (1976). Temple (1978) gives a very
critical review. In the analysis of data on the sedimentary composition of a
closed basin, the factors might be the few inputs to the basin. In the fossil
content of oceanic cores, they might be the depositional climates (tropical,
polar, etc.). This latter problem has recently been studied differently by
Sachs, Siegel and Goldburg (unpublished Princeton report).

Markov  Chains 

In  studying  the  succession  of  different  lithologies,  often  a  small  set 
(e.g.,  sand,  silt  and  clay)  recurs  in  a  partially  cyclic  way.  It  might  be 
that  the  failure  to  be  strictly  cyclic  is  due  to  the  complete  erosion  of  some 
parts  of  the  record.  See  Casshyap  (1975),  Miall  (1973). 
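
As a minimal sketch of the data-analytic side of this approach, the following
code tabulates observed transitions between lithologies in a succession and
converts the counts to row proportions, which are the usual estimates of the
transition probabilities of a first-order chain. The bed sequence is invented
for illustration and the function name is hypothetical. One sand bed is
followed directly by clay, the kind of departure from a strict cycle that the
complete erosion of a silt bed could produce.

```python
from collections import Counter


def transition_matrix(sequence):
    """Estimate first-order transition probabilities from a state sequence."""
    states = sorted(set(sequence))
    # Count each adjacent (lower bed, upper bed) pair in the succession.
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return matrix


# Hypothetical succession of beds, listed from bottom to top (not real data).
beds = ["sand", "silt", "clay", "sand", "silt", "clay",
        "sand", "clay", "sand", "silt", "clay"]

P = transition_matrix(beds)
for lower, row in P.items():
    print(lower, row)
```

A strictly cyclic record would give a matrix of zeros and ones; fractional
entries such as the sand row here measure the departure from strict cyclicity.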

Bounded  Sum  (or  Closed)  Data 

Bounded sum (or closed) data arise, e.g., as the proportions
$p_1, p_2, \ldots, p_k$ of the $k$ constituents of a rock. It is natural to
study such data to see if the relative amounts of substances 1 and 2 are
associated. The facts that $p_1 + p_2 \le 1$ and $\sum_{i=1}^{k} p_i = 1$ make
the usual methods invalid. See Chayes (1971), and Darroch and Ratcliff (1978)
for later work.
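
Why the unit-sum constraint upsets ordinary correlation analysis can be seen
numerically. Because the proportions in every sample add to one, each row of
their covariance matrix must add to zero, so every constituent is forced to have
at least one negative covariance whatever the underlying geology. The sketch
below assumes three independently generated positive amounts (invented data,
hypothetical function names), closes them to proportions, and checks both
effects.

```python
import random


def close(rows):
    """Rescale each row of positive amounts so that it sums to one."""
    return [[x / sum(row) for x in row] for row in rows]


def covariance_matrix(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


rng = random.Random(1)
# Assumption: three independent raw amounts per sample, uniform on (1, 10).
raw = [[rng.uniform(1.0, 10.0) for _ in range(3)] for _ in range(500)]
closed = close(raw)

cov = covariance_matrix(closed)
row_sums = [sum(row) for row in cov]
print("row sums of covariance matrix:", row_sums)
print("covariance of proportions 1 and 2:", cov[0][1])
```

The raw amounts were generated independently, yet the closed proportions are
negatively correlated, which is why a negative sample correlation between two
constituents cannot by itself be read as evidence of association.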

Orientation  Data 

Normals to bedding planes and the directions of cracks and joints provide
examples of axial data; the flow of glaciers and directions of magnetization
provide examples of directional data (Directional Data Analysis, q.v.). See
Watson (1970), McElhinny (1973).
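
For directional (as opposed to axial) data, a standard first summary is the
mean direction, obtained by adding the observations as unit vectors and
normalizing the resultant; the length R of the resultant measures how tightly
the directions cluster. The sketch below assumes directions given as
declination and inclination in degrees, with x north, y east and z down; this
coordinate convention, the sample values and the function names are
illustrative assumptions.

```python
import math


def to_unit_vector(declination_deg, inclination_deg):
    """Convert (declination, inclination) in degrees to a unit vector."""
    d = math.radians(declination_deg)
    i = math.radians(inclination_deg)
    return (math.cos(i) * math.cos(d), math.cos(i) * math.sin(d), math.sin(i))


def mean_direction(directions):
    """Return (mean declination, mean inclination, resultant length R)."""
    vectors = [to_unit_vector(d, i) for d, i in directions]
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    sz = sum(v[2] for v in vectors)
    resultant = math.sqrt(sx * sx + sy * sy + sz * sz)
    mean_dec = math.degrees(math.atan2(sy, sx)) % 360.0
    mean_inc = math.degrees(math.asin(sz / resultant))
    return mean_dec, mean_inc, resultant


# Hypothetical magnetization directions (declination, inclination), degrees.
sample = [(80.0, 45.0), (100.0, 45.0)]
mean_dec, mean_inc, R = mean_direction(sample)
print(mean_dec, mean_inc, R)
```

R equals the number of observations only when all directions coincide. This
vector sum is not appropriate for axial data, where a vector and its negative
denote the same axis.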

Time  Series  Analysis

Time Series Analysis (q.v.) is basic to seismological data processing.
See Brillinger (1975), Tukey (1965).
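
The simplest tool of this kind is the raw periodogram, the squared modulus of
the discrete Fourier transform of the data, which is used to search a record
for periodic components. The sketch below computes it directly from the
definition for a short synthetic series; the series, the scaling by 1/n and
the function name are illustrative assumptions, and practical work would use
a fast Fourier transform with smoothing or tapering.

```python
import cmath
import math


def periodogram(series):
    """Raw periodogram at the Fourier frequencies k/n, k = 1, ..., n // 2."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the mean before transforming
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        result.append((k / n, abs(coeff) ** 2 / n))
    return result


# Synthetic record: a dominant cycle of period 8 plus a weaker one of period 4.
n = 64
series = [
    math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 4)
    for t in range(n)
]

spectrum = periodogram(series)
peak_frequency = max(spectrum, key=lambda pair: pair[1])[0]
print("peak frequency (cycles per sample):", peak_frequency)
```

The largest ordinate falls at the frequency of the dominant cycle, here one
cycle per eight samples.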